Translation Initiation Sites Prediction with Mixture Gaussian Models
Abstract
Translation initiation sites (TIS) are important signals in cDNA sequences, and many research efforts have tried to predict them. In this paper, we propose using mixture Gaussian models to predict TIS in cDNA sequences. New global measures are used to generate numerical features from the cDNA sequences, such as the length of the open reading frame downstream from an ATG and the number of other ATGs upstream and downstream from the current ATG. With these global features, the proposed method predicts TIS with a sensitivity of 98% and a specificity of 92%. The sensitivity is much better than that of other methods. We attribute the improvement in sensitivity to the nature of the global features and the mixture Gaussian models.
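The global features named in the abstract can be illustrated with a minimal sketch. The abstract does not give exact definitions, so the choices below are assumptions: the open reading frame is measured in codons from the candidate ATG up to the first in-frame stop codon (or the end of the sequence), and ATGs are counted in all reading frames. The function names `find_atgs`, `orf_length`, and `global_features` are hypothetical and are not taken from the paper.

```python
# Assumption: the standard stop codons terminate the open reading frame.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_atgs(seq):
    """Return the 0-based start index of every ATG in seq (all frames)."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 2) if seq[i:i + 3] == "ATG"]


def orf_length(seq, start):
    """Count in-frame codons from the ATG at `start` up to, but not
    including, the first in-frame stop codon. If no stop codon is found,
    count every complete codon to the end of the sequence."""
    seq = seq.upper()
    codons = 0
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            break
        codons += 1
    return codons


def global_features(seq, start):
    """Numerical features for the candidate ATG at index `start`."""
    atgs = find_atgs(seq)
    return {
        "orf_codons": orf_length(seq, start),
        "upstream_atgs": sum(1 for i in atgs if i < start),
        "downstream_atgs": sum(1 for i in atgs if i > start),
    }


if __name__ == "__main__":
    example = "CCATGGCCATGAAATAGGG"  # toy sequence, not real data
    for pos in find_atgs(example):
        print(pos, global_features(example, pos))
```

Feature vectors of this kind could then be modeled with one Gaussian mixture per class (true TIS versus other ATGs) and a candidate assigned to the class with the higher likelihood. That is one plausible reading of the abstract, not a description of the authors' exact procedure.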
Similar resources
IMAGE SEGMENTATION USING GAUSSIAN MIXTURE MODEL
Stochastic models such as mixture models, graphical models, Markov random fields and hidden Markov models play a key role in probabilistic data analysis. In this paper, we fit a Gaussian mixture model to the pixels of an image. The parameters of the model are estimated by the EM algorithm. In addition, the label corresponding to each pixel of the true image is assigned by Bayes' rule. In fact, ...
Image Segmentation using Gaussian Mixture Model
Abstract: Stochastic models such as mixture models, graphical models, Markov random fields and hidden Markov models play a key role in probabilistic data analysis. In this paper, we fitted a Gaussian mixture model to the pixels of an image. The parameters of the model were estimated by the EM algorithm. In addition, the label corresponding to each pixel of the true image was assigned by Bayes' rule. In fact,...
Prediction of translation initiation sites on the genome of Synechocystis sp. strain PCC6803 by Hidden Markov model.
We developed a computer program, GeneHackerTL, which predicts the most probable translation initiation site for a given nucleotide sequence. The program requires that information be extracted from the nucleotide sequence data surrounding the translation initiation sites according to the framework of the Hidden Markov Model. Since the translation initiation sites of 72 highly abundant proteins h...
GeneMarkS: a self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions.
Improving the accuracy of prediction of gene starts is one of a few remaining open problems in computer prediction of prokaryotic genes. Its difficulty is caused by the absence of relatively strong sequence patterns identifying true translation initiation sites. In the current paper we show that the accuracy of gene start prediction can be improved by combining models of protein-coding and non-...
Speech Enhancement Using Gaussian Mixture Models, Explicit Bayesian Estimation and Wiener Filtering
Gaussian Mixture Models (GMMs) of power spectral densities of speech and noise are used with explicit Bayesian estimations in Wiener filtering of noisy speech. No assumption is made on the nature or stationarity of the noise. No voice activity detection (VAD) or any other means is employed to estimate the input SNR. The GMM mean vectors are used to form sets of over-determined system of equatio...
Publication date: 2004